## Update statistics for results.
## [1] "MGS"
## [1] "/Users/au618299/programming_projects/minkowski_theory/code/instances/results/algorithm2/MGS-prob-2-100|100-ll-2_1.json"
## [2] "/Users/au618299/programming_projects/minkowski_theory/code/instances/results/algorithm2/MGS-prob-2-100|100-ll-2_2.json"
## [3] "/Users/au618299/programming_projects/minkowski_theory/code/instances/results/algorithm2/MGS-prob-2-100|100-ll-2_3.json"
## [4] "/Users/au618299/programming_projects/minkowski_theory/code/instances/results/algorithm2/MGS-prob-2-100|100-ll-2_4.json"
## [5] "/Users/au618299/programming_projects/minkowski_theory/code/instances/results/algorithm2/MGS-prob-2-100|100-ll-2_5.json"
## [6] "/Users/au618299/programming_projects/minkowski_theory/code/instances/results/algorithm2/MGS-prob-2-100|100-mm-2_1.json"
## # A tibble: 4 × 1
##   filename                                 
##   <chr>                                    
## 1 prob-2-200|200|200|200-mmmm-4_2.json     
## 2 prob-2-300|300-mm-2_1.json               
## 3 prob-2-300|300-mm-2_2.json               
## 4 prob-2-300|300|300|300|300-lllll-5_2.json

Total time used in computing minimum generator sets (MGS)

Hours used: 649.5443305

IP problems solved: 4

Note:

For all the experiments the minimum generator sets are unique. In particular every generator \(\mathcal{G}\) with corrosponding set \[ (\mathcal{Y}^1, \mathcal{Y}^2... \mathcal{Y}^S)\] we have: \[ \mathcal{Y} = \left\{ y^s \vert \exists y \in \mathcal{Y}_N, c_s = y^s, \forall c \in \mathcal{C}(y) \right\} \] This also implies that the found minimum generator sets are unique minimal generator sets. This is not generally the case as seen in example [ref generatorNotUnique]. It implies however that minimal generator sets are likely to be unique.

STATUS

## [1] 1270
## # A tibble: 5 × 2
##   filename                                  method
##   <chr>                                     <chr> 
## 1 prob-5-300|300|300|300|300-mmmmm-5_1.json m     
## 2 prob-5-300|300|300|300|300-mmmmm-5_2.json m     
## 3 prob-5-300|300|300|300|300-mmmmm-5_3.json m     
## 4 prob-5-300|300|300|300|300-mmmmm-5_4.json m     
## 5 prob-5-300|300|300|300|300-mmmmm-5_5.json m

Solved 1270 of 1280`

MGS size as a function of maximum possible size.

MGS Size (with min/max of MGS sets

##    [1] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3
##   [38] 3 3 3 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 5 5 5 5 5 5 5 5 5 5 5 5 5 5
##   [75] 5 5 5 5 5 5 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 3 3 3 3 3 3 3 3 3 3 3
##  [112] 3 3 3 3 3 3 3 3 3 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 5 5 5 5 5 5 5 5
##  [149] 5 5 5 5 5 5 5 5 5 5 5 5 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 3 3 3 3 3
##  [186] 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 5 5
##  [223] 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
##  [260] 2 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4
##  [297] 4 4 4 4 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 2 2 2 2 2 2 2 2 2 2 2 2 2
##  [334] 2 2 2 2 2 2 2 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 4 4 4 4 4 4 4 4 4 4
##  [371] 4 4 4 4 4 4 4 4 4 4 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 2 2 2 2 2 2 2
##  [408] 2 2 2 2 2 2 2 2 2 2 2 2 2 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 4 4 4 4
##  [445] 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 2
##  [482] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3
##  [519] 3 3 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5
##  [556] 5 5 5 5 5 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 3 3 3 3 3 3 3 3 3 3 3 3
##  [593] 3 3 3 3 3 3 3 3 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 5 5 5 5 5 5 5 5 5
##  [630] 5 5 5 5 5 5 5 5 5 5 5 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 3 3 3 3 3 3
##  [667] 3 3 3 3 3 3 3 3 3 3 3 3 3 3 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 5 5 5
##  [704] 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
##  [741] 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4
##  [778] 4 4 4 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 2 2 2 2 2 2 2 2 2 2 2 2 2 2
##  [815] 2 2 2 2 2 2 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 4 4 4 4 4 4 4 4 4 4 4
##  [852] 4 4 4 4 4 4 4 4 4 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 2 2 2 2 2 2 2 2
##  [889] 2 2 2 2 2 2 2 2 2 2 2 2 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 4 4 4 4 4
##  [926] 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 2 2
##  [963] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3
## [1000] 3 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5
## [1037] 5 5 5 5 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 3 3 3 3 3 3 3 3 3 3 3 3 3
## [1074] 3 3 3 3 3 3 3 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 5 5 5 5 5 5 5 5 5 5
## [1111] 5 5 5 5 5 5 5 5 5 5 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 3 3 3 3 3 3 3
## [1148] 3 3 3 3 3 3 3 3 3 3 3 3 3 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 5 5 5 5
## [1185] 5 5 5 5 5 5 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 3 3 3 3 3 3 3 3 3 3 3
## [1222] 3 3 3 3 3 3 3 3 3 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 5 5 5 5 5 5 5 5
## [1259] 5 5 5 5 5 5 5 5 5 5 5 5
## Levels: 2 3 4 5
## # A tibble: 16 × 8
##    p     m     size_50 size_100 size_200 size_300 mean_MGS_size mean_Yn_size
##    <fct> <fct>   <dbl>    <dbl>    <dbl>    <dbl>         <dbl>        <dbl>
##  1 2     2       0.824    0.783    0.738    0.691         0.759        1004.
##  2 2     3       0.789    0.745    0.692    0.714         0.735        3307.
##  3 2     4       0.710    0.640    0.636    0.637         0.656        5972.
##  4 2     5       0.694    0.589    0.575    0.576         0.609        8921.
##  5 3     2       0.948    0.909    0.868    0.844         0.892        3039.
##  6 3     3       0.895    0.860    0.831    0.819         0.851       22645.
##  7 3     4       0.849    0.800    0.759    0.745         0.788       89163.
##  8 3     5       0.813    0.760    0.684    0.696         0.738      253840.
##  9 4     2       0.994    0.977    0.959    0.946         0.969        6675.
## 10 4     3       0.975    0.948    0.918    0.903         0.936       85007.
## 11 4     4       0.959    0.922    0.880    0.857         0.905      521603.
## 12 4     5       0.938    0.892    0.844    0.821         0.874     2115251.
## 13 5     2       1        0.998    0.994    0.988         0.995       12267.
## 14 5     3       1.00     0.989    0.971    0.960         0.980      273210.
## 15 5     4       0.998    0.977    0.953    0.934         0.966     2530229.
## 16 5     5       0.996    0.971    0.934    0.885         0.946     7620738.

Consider only vectors removed from u and l subproblems:

## # A tibble: 32 × 5
## # Groups:   p, method [8]
##    p     method avg_sp_size mean_absolute_u_avg mean_relative_u_avg
##    <fct> <chr>        <dbl>               <dbl>               <dbl>
##  1 2     u               50               44.7               0.893 
##  2 2     u              100               77.6               0.776 
##  3 2     u              200              159.                0.796 
##  4 2     u              300              254.                0.846 
##  5 2     ul              50                5.96              0.119 
##  6 2     ul             100                4.72              0.0472
##  7 2     ul             200                4.21              0.0210
##  8 2     ul             300                3.56              0.0119
##  9 3     u               50               48.2               0.963 
## 10 3     u              100               91.8               0.918 
## # ℹ 22 more rows

## [1] "44%"
## [1] "74%"
## [1] "84%"

For the instances l the MGS tend to consist of all subproblem vectors. Also for the u instances a large number of the subproblem vectors are used in the minimum generator sets, however the spread is significantly higher than that of l, and in some cases for u the MGS consists of only half the total subproblem vectors.

The instances m and ul both contain a larger amount of unsupported and supported vectors. [TODO Show how many approx 50% of each?]. The difference between the instances are that for m each subset consists of a mix of supported and unsupported vectors, while the ul instances consists of subsets which have either many unsupported (u), or many supported vectors (l). For m and ul we find that the MGS is unlikely to consist of all the subproblem vectors. On average the MGS consists of only 72.79% of the total subproblem vectors for m instances and 73.58% for ul instances. This implies that around 27% of the generated subproblem vectors are not needed for generating the nondominated sum.

Relative size of MGS wrt. average subproblem cardinality

In [above] we see that the relative size of the MGS for the instances m and ul decrease in the total number of subproblem vectors and in the number of subproblems.

For the instances l we see that in almost all instances the MGS consists of all subproblem vectors.

The u exibit a different pattern from the rest. Here we find that for instances \(p>3\) the MGS tend to consist of all subproblem vectors. For the remaining instances we find that like ul and m the size of MGS is decreasing in the number of subproblems.

Relative size of MGS wrt. total subproblem cardinality

size of Yn (like for alg1)

classifications

Number of US vectors in MGS relative to the number of US vectors in the subproblems.

(Problemer er alle samme størrelse)

M tilfældet ligner (phd) understøtter hypotesen at flere ikke-støttede punkter medfører mindre generator set. “ny opdagelse” forskelligheden mellem delproblemer betyder også noget. Sammenlign u og ul, (kun u-subproblemer). Hvis de ligger oveni hinanden, men har næsten alene ikke-støttede punkter, kan vi ikke nødvendigvis fjerne punkter. Alstå har formen betydning ikke alene antallet af ikke-støttede punkter. Når delproblemerne ligger tæt på hinanden kan vi ikke fjerne noget, dette kan undersøges nærmere.

Når p stiger falder nytten af MGS. Når m stiger, øges nytten (for fast p).

For m og ul ses det at flere sp punkter medfører mindre MGS. (Overvej argument, ellers fjern kommentar) Forbehold: Vi tester kun på (ens) subproblemer. Samme størrelse

Kommentar: Generering af instanser har stor betydning på resultaterne. Antal ikke-støttede punkter er isoleret set ikke sigende for MGS

Number of supported but non-extreme vectors removed

Validation

Check that each generator set includes all extreme vectors:

## # A tibble: 1,270 × 9
##    prob_sizes_se_total prob_sizes_us_total prob_sizes_sne_total
##                  <dbl>               <dbl>                <dbl>
##  1                 166                  33                  167
##  2                 161                  38                  162
##  3                 162                  34                  166
##  4                 167                  33                  167
##  5                 164                  35                  165
##  6                  15                 185                   15
##  7                  18                 182                   18
##  8                  20                 180                   20
##  9                  20                 180                   20
## 10                  21                 179                   21
## # ℹ 1,260 more rows
## # ℹ 6 more variables: MGS_sizes_se_total <dbl>, MGS_sizes_us_total <dbl>,
## #   MGS_sizes_sne_total <dbl>, MGS_size <dbl>, prob_total_extreme <dbl>,
## #   MGS_total_extreme <dbl>
## # A tibble: 1,270 × 7
##    prob_sizes_se prob_sizes_us prob_sizes_sne MGS_sizes_se MGS_sizes_us
##    <chr>         <chr>         <chr>          <chr>        <chr>       
##  1 83-83         16-17         84-83          83-83        16-17       
##  2 83-78         17-21         83-79          83-78        17-21       
##  3 79-83         20-14         80-86          79-83        20-14       
##  4 84-83         16-17         84-83          84-83        16-17       
##  5 86-78         14-21         86-79          86-78        14-21       
##  6 7-8           93-92         7-8            7-8          57-54       
##  7 8-10          92-90         8-10           8-10         49-42       
##  8 8-12          92-88         8-12           8-12         61-48       
##  9 12-8          88-92         12-8           12-8         48-61       
## 10 10-11         90-89         10-11          10-11        41-49       
## # ℹ 1,260 more rows
## # ℹ 2 more variables: MGS_sizes_sne <chr>, MGS_sizes <chr>
## # A tibble: 1,270 × 7
##    prob_sizes_se_total prob_sizes_us_total prob_sizes_sne_total
##                  <dbl>               <dbl>                <dbl>
##  1                 166                  33                  167
##  2                 161                  38                  162
##  3                 162                  34                  166
##  4                 167                  33                  167
##  5                 164                  35                  165
##  6                  15                 185                   15
##  7                  18                 182                   18
##  8                  20                 180                   20
##  9                  20                 180                   20
## 10                  21                 179                   21
## # ℹ 1,260 more rows
## # ℹ 4 more variables: MGS_sizes_se_total <dbl>, MGS_sizes_us_total <dbl>,
## #   MGS_sizes_sne_total <dbl>, MGS_size <dbl>

Add to alg1

Add to alg1:

All tests are run on a Linux (CentOS 7) cluster with cores @ 3.1 GHz (AMD EPYC 9554).

The [algorithm1] was implemented in Python 3.11. The script calls a c-implementation of the LimMem algorithm described in [ref Klamroth] which allows a user-defined maximum memory allocation, instead of storing the entiry Minkowski sum and subsequently filtering out the dominated vectors. A memory limit of 64 GB. The LimMem algorithm computes the nondominated sum of two sets. The c-implementation was provided by Bruno Lang.

Draft

In this subsection we study the size of MGS for the instances described in [ref to table]. Here we omit the test instances, which only consists of subproblem vectors generated using method [l]. This is because almost all subsets generated with method [l] consists exclusively of extreme supported vectors which necessarily are included in any MGS following [prop x].

## # A tibble: 4 × 2
##   filename                                  MGS_size
##   <chr>                                        <dbl>
## 1 prob-2-200|200|200|200-mmmm-4_2.json           209
## 2 prob-2-300|300-mm-2_1.json                     261
## 3 prob-2-300|300-mm-2_2.json                     267
## 4 prob-2-300|300|300|300|300-lllll-5_2.json     1422

In total all \(900\) instances was solved. From these only 64 instances did not satisfy the condition [Reduced = fixed, line reference] and of these only 4 did not satisfy the generating property [generating property, line reference]. For these instances an IP problem was solved. As such 1266 problems had a unique generator set following [prop fixed=reduced => unique].

(TODO? We can check if the remaining 3 instanced also have unique MGS using no-good constraints).

For each instance we are interested in the size of the MGS. To compare the size of MGS for varying number of subproblems and varying subproblem sizes we consider the MGS size relative to the size of the trivial generator set (the total amount of subproblem vectors). The average relative MGS size over all instances was was 85.0%, implying that on average 15.0% of the subproblem vectors where not needed to generate the nondominated sum. The relative size of MGS ranges from 15.2% to 100.0% over all 1270 instances. In the following we investigate to which extend the relative MGS size depends on the factors: subproblem sizes, number of subproblems and number of objectives. Finally, we will investigate how the MGS size depend on the proportion of unsupported vectors in the subproblems.

We find that the relative MGS size decreases in both the number of subproblems and in the size of subproblems. Both factors increase the denominator in the relative size [formular for relative size]. To compare the two effects we consider the instances where the total number of subproblem vectors are the same but the number of subproblems and the number of subproblem vectors vary. eg. instances with 2 subproblems of 100 vectors have the same total subproblem vectors as the instances with 4 subproblems each containing 50 vectors. In total we compare 360 instances with 200, 300 and 600 total subproblem vectors, partitioned into instances with varying number of subproblems.

As presented in Table[above] the size of MGS for instances with many smaller subproblems was consistantly smaller than those of instances with few, but larger subproblems. On average MGS consisted of 89.1% for instances with a few, but large subproblems while the size of the MGS instances with many smaller subproblems consisted of 85.6%.

When looking at the effects of the number of objective functions on the relative MGS size we get 69.0% (p=2), 81.7% (p=3), 92.1% (p=4) and 97.5% (p=5). Here we see a clear tendency that the number of subproblem vectors required for a MGS increases in the dimension of the objective space. This might be a direct consequence of the higher proportion of extreme supported vectors in higher dimension [refer to general comment “large dimension => many extreme vectors” by Serpil Sayın?]. To see this we note that the proportion of supported vectors also increase in the number of objective functions 30.3% (p=2), 46.4% (p=3), 52.8% (p=4) and 60.0% (p=5).

We will now consider the effect of the number of unsupported subproblem vectors on the MGS size. From [prop X] we know that all extreme subproblem vectors must be included in any MGS. Because of this we hypothesize that the size of the MGS decreases in the relative number of unsupported subproblem vectors. To test the hypothesis we use the generation methods \(l,m,ul\) as indicators for the amount of unsupported vectors. We expect the MGS to be smallest in the case \(u\) which have the highest proportion of unsupported vectors. Surprisingly, we find that the MGS size is largest for the instances where all subproblems are generated using the \(u\) method. We do not see the expected negative correlation between the proportion of unsupported vectors 6.7% (l), 74.4% (m), 84.4% (u) and 44.2% (ul), and the average relative size of MGS 99.9% (l), 72.8% (m), 93.6% (u) and 73.6% (ul). This surprisingly shows that MGS are larger in the instances where almost all subproblem vectors are unsupported. This might be an artifact of the generation method chosen for the \(u\) subproblems. Two subproblems generated using \(u\) method will lie on the same circle, while vectors generated using the \(m\) method will lie between the same two hyperplanes.

(TODO add scalling argument)

If we instead fix the generation method and look at the relative size of MGS as a function of the proportion of unsupported vectors we find a strong relationship between the proportion of unsupported vectors and the size of the MGS. We get the following \(R^2\) values 16.9% (l), 89.7% (m) and 6.0% (u). However, we get a similar fit if we instead use the average subproblem size as the explanitory variable for the relative MGS size. Here we get the \(R^2\) values 16.9% (l), 89.7% (m) and 6.0% (u).

Acknowledgement